7 research outputs found

    Annotating Satire in Italian Political Commentaries with Appraisal Theory

    This paper presents annotation work to manually classify satire/sarcasm in long commentaries on Italian politics. It is based on Appraisal Theory and uses some 30K words of text. The underlying hypothesis is that this framework makes it possible to pinpoint precisely the deep semantic content of the evaluative judgements and appreciations that make up an ironic comment. We performed a high-level annotation using the major categories, and then refined the classification starting from the lexica derived from the XML-annotated files. In this way we succeeded in differentiating the texts of the two authors we chose, one of whom is characterized by a sharp, cutting, ironic, almost sarcastic style.

    Multi-Word Expressions in spoken language: PoliSdict

    The term multiword expression (MWE) refers to a group of words with a unitary meaning that cannot be inferred from the meanings of the words composing it, both in current use and in technical-specialized languages. In this paper, we describe PoliSdict, an Italian electronic dictionary composed of multi-word expressions (MWEs) occurring in spontaneous speech, automatically extracted from a multimodal corpus of Italian political speech. The corpus is currently being expanded at the "Maurice Gross" Laboratory of the Department of Political, Social and Communication Sciences of the University of Salerno, thanks to funding from the company Network Contacts. We introduce the methodology of creation and the first results of a systematic analysis that considered terminological labels, frequency labels and recurring syntactic patterns, and we further propose an associated ontology.

    Proceedings of the Fifth Italian Conference on Computational Linguistics CLiC-it 2018

    On behalf of the Program Committee, a very warm welcome to the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018). This edition of the conference is held in Torino. The conference is locally organised by the University of Torino and hosted in its prestigious main lecture hall, "Cavallerizza Reale". The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after five years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges.

    Detecting Satire in Italian Political Commentaries

    This paper presents computational work to detect satire/sarcasm in long commentaries on Italian politics. It uses the lexica extracted from a manual annotation, based on Appraisal Theory, of some 30K words of text. The underlying hypothesis is that this framework makes it possible to pinpoint ironic content precisely through the deep semantic analysis of evaluative judgement and appreciation. The paper presents the manual annotation phase, carried out on 112 texts by two well-known Italian journalists. After a first experimentation phase based on the lexica extracted from the XML output files, we retagged the lexical entries, dividing them into two subclasses: figurative and literal meaning. Finally, more fine-grained Appraisal features were derived, and further experiments were carried out and compared with the results obtained by a lean sentiment analysis. The final output is produced from held-out texts to verify the usefulness of the lexica and of Appraisal Theory in detecting ironic content.
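
    As a rough illustration of the lexicon-based step described in this abstract, the sketch below counts how many tokens of a text match a "figurative" versus a "literal" lexicon. It is only a minimal sketch under stated assumptions: the lexicon entries, the function name `lexicon_profile` and the ratio it reports are hypothetical and are not taken from the paper, whose actual lexica and Appraisal features are not given here.

```python
import re
from collections import Counter

# Hypothetical lexica: illustrative entries only, NOT the paper's resources.
FIGURATIVE = {"circo", "teatrino", "miracolo"}
LITERAL = {"governo", "riforma", "voto"}


def lexicon_profile(text: str) -> dict:
    """Count lexicon hits per subclass and report the share of figurative hits."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter()
    for tok in tokens:
        if tok in FIGURATIVE:
            counts["figurative"] += 1
        elif tok in LITERAL:
            counts["literal"] += 1
    total = counts["figurative"] + counts["literal"]
    # Guard against texts with no lexicon hits at all.
    ratio = counts["figurative"] / total if total else 0.0
    return {
        "figurative": counts["figurative"],
        "literal": counts["literal"],
        "figurative_ratio": ratio,
    }


profile = lexicon_profile("Il governo promette un miracolo, ma il voto resta un circo.")
```

    A profile of this kind could serve as one input feature among others; how the paper actually combines its lexica with Appraisal features is not specified in the abstract.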

    A hybrid method for the extraction and classification of product features from user-generated contents

    The research we present in this paper focuses on the automatic management of knowledge about experience goods and their features, starting from real texts generated online by internet users. The paper describes an experiment conducted on a dataset of product reviews, on which we tested a set of rule-based and statistical solutions. The main goals are the classification of reviews, the extraction of relevant product features and their systematization into product-driven ontologies. Feature extraction is performed through a rule-based strategy grounded on SentIta, an Italian collection of subjective lexical resources. Features and reviews are classified by means of a distributional semantic algorithm. Finally, we address the problem of organizing the extracted knowledge by integrating the subjective information produced by internet users within a product-driven ontology. The NLP tool used in this work is LG-Starship, a hybrid framework for processing Italian texts based on the Lexicon-Grammar theory.

    An Integrative Bayesian Modeling Approach to Imaging Genetics

    In this paper we present a Bayesian hierarchical modeling approach for imaging genetics, where the interest lies in linking brain connectivity across multiple individuals to their genetic information. We use data from a functional magnetic resonance imaging (fMRI) study on schizophrenia. Our goals are to identify brain regions of interest (ROIs) with discriminating activation patterns between schizophrenic patients and healthy controls, and to relate the ROIs' activations to the available genetic information from single nucleotide polymorphisms (SNPs) on the subjects. For this task we develop a hierarchical mixture model that includes several innovative characteristics: it incorporates the selection of ROIs that discriminate the subjects into separate groups; it allows the mixture components to depend on selected covariates; and it includes prior models that capture structural dependencies among the ROIs. Applied to the schizophrenia data set, the model leads to the simultaneous selection of a set of discriminatory ROIs and the relevant SNPs, together with the reconstruction of the correlation structure of the selected regions. To the best of our knowledge, our work represents the first attempt at a rigorous modeling strategy for imaging genetics data that incorporates all such features.
